Skip to content

ARROW-12165: [Rust] inline append functions of builders#9860

Closed
ritchie46 wants to merge 1 commit into
apache:masterfrom
ritchie46:inline_builder_appends
Closed

ARROW-12165: [Rust] inline append functions of builders#9860
ritchie46 wants to merge 1 commit into
apache:masterfrom
ritchie46:inline_builder_appends

Conversation

@ritchie46

Copy link
Copy Markdown
Contributor

The append functions in the Builder structs are often used in "hot" code. This PR tags them with #[inline], making it possible to inline the function calls across crate boundaries.

@nevi-me nevi-me left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, there's lots of appending going on here 😵

@Dandandan

Copy link
Copy Markdown
Contributor

@ritchie46 did you have some perf results on this?
I think it definitely makes sense, especially as the functions are wrapped in (unnecessary) Result now which might make things worse. Sometimes inlining makes things worse performance wise, that's why I think we should have some numbers.

Comment thread rust/arrow/src/array/builder.rs Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a large function to inline? Maybe we need to be conservative here and not mark it as inline

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I will remove that one.

@ritchie46

ritchie46 commented Mar 31, 2021

Copy link
Copy Markdown
Contributor Author

@ritchie46 did you have some perf results on this?
I think it definitely makes sense, especially as the functions are wrapped in (unnecessary) Result now which might make things worse. Sometimes inlining makes things worse performance wise, that's why I think we should have some numbers.

There are quite some functions, but for met the most important one is the primitive builder.

Running this benchmark:

fn create_primitive_array(n: usize) -> PrimitiveArray<Int64Type> {
    let mut builder = PrimitiveBuilder::new(n);

    for i in 0..n {
        builder.append_value(i as i64).unwrap();
    }

    builder.finish()
}

fn add_benchmark(c: &mut Criterion) {
    c.bench_function("512", |b| b.iter(|| create_primitive_array(512)));
    c.bench_function("4096", |b| b.iter(|| create_primitive_array(4096)));
}

Gave the following improvement:

PrimitiveBuilder

Gnuplot not found, using plotters backend
512                     time:   [848.60 ns 848.70 ns 848.81 ns]                 
                        change: [-54.476% -54.424% -54.383%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  1 (1.00%) high mild
  3 (3.00%) high severe

4096                    time:   [5.3370 us 5.3449 us 5.3539 us]                  
                        change: [-62.787% -62.696% -62.617%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

StringBuilder

For a StringBuilder it shows the following results. Still improvements, but less, because the function call is a smaller part of the work.

Gnuplot not found, using plotters backend
512                     time:   [6.1797 us 6.1825 us 6.1851 us]                 
                        change: [-9.0882% -9.0515% -9.0141%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

4096                    time:   [37.948 us 37.957 us 37.965 us]                  
                        change: [-13.363% -13.248% -13.149%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

@ritchie46 ritchie46 force-pushed the inline_builder_appends branch from fcd2dee to 1cb93ee Compare March 31, 2021 11:38
@ritchie46

Copy link
Copy Markdown
Contributor Author

especially as the functions are wrapped in (unnecessary) Result now which might make things worse.

Is there a reason not to remove this? Backwards incompatibility changes are already happened, so maybe we can remove this?

@github-actions

Copy link
Copy Markdown

@Dandandan

Copy link
Copy Markdown
Contributor

@ritchie46 quite nice micro benchmark results 👍

Is there a reason not to remove this? Backwards incompatibility changes are already happened, so maybe we can remove this?

I don't think there is any reason. I tried to do it some time ago, but it requires a lot of work as it is used in quite some code as you can imagine.

@codecov-io

Copy link
Copy Markdown

Codecov Report

Merging #9860 (1cb93ee) into master (4de0ed7) will decrease coverage by 0.00%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #9860      +/-   ##
==========================================
- Coverage   82.61%   82.60%   -0.01%     
==========================================
  Files         254      255       +1     
  Lines       59583    59588       +5     
==========================================
  Hits        49225    49225              
- Misses      10358    10363       +5     
Impacted Files Coverage Δ
rust/arrow/src/array/builder.rs 85.22% <ø> (ø)
rust/parquet/benches/arrow_writer.rs 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7a0d334...1cb93ee. Read the comment docs.

@alamb

alamb commented Mar 31, 2021

Copy link
Copy Markdown
Contributor

I don't think there is any reason. I tried to do it some time ago, but it requires a lot of work as it is used in quite some code as you can imagine.

Yeah I agree it would make the API easier to work with if the builders didn't return Result. I think it would be worth the backwards compatibility hit but as @Dandandan says it is a lot of work

@alamb alamb left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ritchie46 -- this looks great to me

@codecov-commenter

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.60%. Comparing base (4de0ed7) to head (1cb93ee).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9860      +/-   ##
==========================================
- Coverage   82.61%   82.60%   -0.01%     
==========================================
  Files         254      255       +1     
  Lines       59583    59588       +5     
==========================================
  Hits        49225    49225              
- Misses      10358    10363       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants